The paper describes a completed and independent module of a
large-scale system, the Quasi-Optimizer (QO). The QO system has
three major objectives: (i) to observe and measure adversaries'
behavior in a competitive environment, to infer their strategies
and to construct a computer model, a descriptive theory, of each;
(ii) to identify strategy components, evaluate their
effectiveness and to select the most satisfactory ones from a set
of computed descriptive theories; (iii) to combine these
components in a quasi-optimum strategy that represents a
normative theory in the statistical sense.

The measurements on the input strategies can take place
either in a sequence of confrontations unperturbed by the QO or,
for efficiency's sake, in a series of environments specified
according to some experimental design. The module completed
first, QO-1, can perform the experiments either in an exhaustive
manner (when every level of a decision variable is combined
with every level of the other decision variables) or, relying on
the assumption of a monotonically changing response surface, by
the binary chopping technique.

The module discussed here, QO-3, does not assume monotonic
response surfaces and can deal also with multidimensional
responses. It starts with a (loosely) balanced incomplete block
design for the experiments and computes dynamically the
specifications for each subsequent experiment. Accordingly, the
levels of the decision variables in any single experiment and the
length of the whole sequence of experiments depend on the
responses obtained in previous experiments. In general, QO-3 is
an on-line, dynamic generator of experimental design that
minimizes the total number of experiments performed for a
predetermined level of precision.

1. INTRODUCTION

The Quasi-Optimizer (QO) system [1, 2] observes and measures
adversaries' behavior in a competitive environment, infers their
strategies, and constructs a computer model (a "descriptive
theory") of each. By evaluating the effectiveness of the
components of these strategies and selecting the most satisfying
ones (credit assignment), it generates a "normative theory" which
is an optimum strategy in the statistical sense. The measurement
of the adversaries' behavior can take place either in a sequence
of unperturbed confrontations or under "laboratory conditions"
when the environment for each confrontation is specified
according to some experimental design. We shall be concerned
with the second mode of operation in this paper.

The first of six fairly independent modules of the QO
system, QO-1, constructs a descriptive theory of static
strategies given as black-box programs impenetrable to the
system [3]. It also identifies which of all possible decision
variables are relevant for the strategy being modelled. The
program can use either an exhaustive search pattern or a binary
chopping technique in the space of decision variables while
carrying out a sequence of controlled experiments on the
strategy. As an inductive discovery feature, it can also
correlate certain stochastic consequences of the strategy with
subranges of values of each decision variable. The strategy
response surface is assumed by QO-1 to be weakly monotonic.

The present paper deals with a significant generalization of
the QO-1 program. The module QO-3 is designed to minimize the
total number of experiments while maintaining a user-specified
minimum level of precision in identifying the strategy response,
R, over the whole space of decision variables. The response
surface need not be monotonic now. After an exploratory phase,
the design of each experiment depends on the results of the
experiments obtained up to that point.

The program is completely general purpose. However,
because of the specific context of the QO project, we shall use
the terms 'decision variables' for the control variables in
experiments, and 'strategy response' for the scalar or vector
entity that is the outcome of the experiments.

2. ON STRATEGIES AND COMPUTER REPRESENTATION

A strategy is considered at its simplest level to be a
decision making mechanism that observes and evaluates its
environment, and prescribes in response to it a single, one-step
action. We can extend this concept in various directions. The
single (that is, one-dimensional) action can be replaced by a set
of (that is, multidimensional) actions. The one-step (momentary)
action can be replaced by a sequence of actions, unordered,
weakly or strongly ordered, over time. Furthermore, the decision
variables defining the environment may also include descriptors
characterizing relevant aspects of the history of the
environment. These ideas make our studies more realistic in
taking into account multidimensional strategy responses to
complex environments, long-range planning, tactical and strategic
considerations (with reference to short-term and long-term
objectives, respectively). We can study automatically generated
"methods", in which goals and current features of the environment
are associated with sequences of actions, and "blueprints", in
which goals are presented as desired features of the environment.
We also distinguish between static and dynamic strategies. The
latter are either controlled by a learning mechanism (to improve
performance or to adapt to new environments), or exhibit periodic
or random behavior. (See [4] for a detailed discussion.)

We have selected the decision tree (DT) as an efficient and
effective representation of simple, single-action strategies [1].
(See Fig. 1.) We have also shown that DTs are equivalent in
power to production systems but can be modified more easily and
their scope of representational validity can be extended as
needed. These extensions are as follows:


FIGURE  1  ABOUT  HERE 


•When the strategy response is a vector quantity, each of
its components requires a separate DT. (We are currently
studying techniques to eliminate any redundancy inherent in cases
in which the vector components are correlated.)

•A time-sequence of actions can be attached to the
leaf-level, instead of one-step strategy responses, to describe
the result of strategic planning.

•Judiciously chosen decision variables can characterize the
relevant aspects of the history of a confrontation or of the
development of an environment.

•A learning strategy is represented by a sequence of DTs,
each being a "snapshot" taken of the strategy, with the learning
component frozen, at different time points. We have devised an
algorithm, the QO-2 module [4], that computes the asymptotic form
of the sequence of DTs, when the result is statistically valid.
This extrapolated DT is then used as one of the input strategies
in the computation of the normative strategy.

3. THE QO-3 PROGRAM

QO-3 is best explained by going through its phases of
operation in chronological order.

3.1. The User Input

The whole program is highly interactive and relies on the
user's advice when feasible. As described below, the first,
exploratory phase of the program specifies decision variable
levels according to a loosely balanced incomplete block design, a
term to be explained later. The user first has to input a
so-called reduction factor, f, which is the ratio between the
number of exploratory experiments and the number of all possible
experiments. The latter is the product of the number of
meaningfully distinct levels of every decision variable,
essentially the cardinality of the experiment. The reduction
factor provides the user with control over the usual trade-off in
experimentation between cost and precision.
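
As a rough illustration of the reduction factor, the sketch below
computes the cardinality of an experiment and the size of the
exploratory sample. The function names and the rounding rule are
assumptions made for this example; the text does not say how a
fractional count is rounded.

```python
from fractions import Fraction
from math import ceil, prod


def cardinality(levels):
    """Number of all possible experiments: product of the level counts."""
    return prod(levels)


def exploratory_count(levels, f):
    """Size of the exploratory sample for reduction factor f.

    Rounding up is an assumption of this sketch, not taken from the paper.
    """
    return ceil(f * cardinality(levels))


# Four decision variables with 6, 4, 3 and 2 levels and f = 2/5.
levels = [6, 4, 3, 2]
f = Fraction(2, 5)
print(cardinality(levels), exploratory_count(levels, f))
```

A smaller f lowers the cost of the exploratory phase at the price
of a coarser first picture of the response surface.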

The user is then asked to specify ΔR, the desired precision
(or error tolerated). This is then considered by the system as
the minimum discernible difference between the strategy responses
given at two adjacent experimental points in the decision
variable space. Therefore, if there is reason to assume a weakly
monotonic response surface, the latter is considered "flat"
between adjacent points whenever the response values at such
points differ by no more than ΔR. (Our program repeats the
experiments at the two points once more and also checks the
response value at the midpoint because of possible
non-monotonicity of the surface and the usually stochastic nature
of the environment.)

Next, the user inputs information about each decision
variable. This consists of its name, type, range, and the
initial estimate of the number of levels it assumes. There are
three types of variables:

(i) Numerical, in which case the range of values is
normalized to (0, 128). The user estimates how sensitive the
strategy response is to changes in the variable in question.
Higher sensitivity, i.e. more rapid changes, would require more
levels in the variable. The user must also specify the maximum
meaningful resolution (MMR), which is the smallest discernible
difference between the values of the variable. In other words,
the grid size along that dimension must be at least as large as
MMR.

(ii) Ordered categorical variables assume symbolic values
which are, by their nature, ordered. Examples are rank numbers,
the days of the week, musical notes, even colors when their
respective wave lengths have some significance. The user may
enter, for example, ((COLOR (RED ORANGE YELLOW GREEN BLUE INDIGO
VIOLET))). The system again maps the range of the user-specified
values onto (0, 128). The user also provides an MMR value to
express how "influential" the variable is with regard to changes
in the strategy response. Used wisely, MMR lets the user control
the number of experiments until more information becomes
available about the nature of the response surface. The highest
number of levels of a numerical or ordered categorical variable,
NL, is related to MMR as

        MMR = 128 / (NL - 1)
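
The relation between MMR and NL on the normalized (0, 128) scale
can be checked with a few lines of code. This is a minimal sketch;
the function names are hypothetical and the handling of an MMR that
does not divide 128 evenly (rounding the level count down) is an
assumption.

```python
def mmr_for_levels(nl):
    """MMR implied by the highest number of levels NL."""
    return 128 / (nl - 1)


def max_levels(mmr):
    """Highest number of levels whose grid spacing is at least MMR."""
    return int(128 // mmr) + 1


def normalized_levels(nl):
    """Equally spaced level values on the normalized range (0, 128)."""
    step = 128 / (nl - 1)
    return [i * step for i in range(nl)]


print(mmr_for_levels(17))    # MMR for 17 levels
print(max_levels(8))         # levels allowed by an MMR of 8
print(normalized_levels(5))  # five initial levels
```

With five levels the grid is (0 32 64 96 128); an MMR of 8 allows
the grid to be refined to at most 17 levels.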

(iii) Unordered categorical variables, too, assume symbolic
values but these have no meaningful order. The user may, for
example, specify ((ANSWER (YES NO)) (SPICES (SALT PEPPER
PAPRIKA))). There is no MMR specifiable here. Unordered
categorical variables are treated differently: all levels given
are used exhaustively, as explained below.

We should point out that QO-3 is robust enough to rectify
user errors about the importance of individual decision
variables. The program will trim and add levels as the
experimentation proceeds and the shape of the response surface
emerges. However, time and cost of experiments are saved when
the user's estimates are sound.

Finally, the system computes all acceptable base-unit values
and, if there is more than one, it asks the user to select one.
The base-unit is the greatest common divisor of the number of
levels of all numerical and ordered categorical variables. (The
unordered categorical variables are always exhaustively
searched.) An 'acceptable' base-unit is usually a compromise
representing the smallest number of levels added to those
originally specified by the user, over all affected variables.
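
The trade-off behind an 'acceptable' base-unit can be sketched as
follows. The code assumes that each level count is padded up to the
next multiple of a candidate base-unit and that candidates are
compared by the total number of levels added; both the padding rule
and the function names are assumptions made for this illustration.

```python
from functools import reduce
from math import gcd


def base_unit(levels):
    """Greatest common divisor of the level counts."""
    return reduce(gcd, levels)


def padded_levels(levels, n):
    """Round every level count up to the next multiple of n (assumed rule)."""
    return [count + (-count % n) for count in levels]


def added_levels(levels, n):
    """Total number of levels added when n is forced as the base-unit."""
    return sum(padded_levels(levels, n)) - sum(levels)


# Initial level counts of three numerical or ordered categorical variables.
levels = [6, 4, 3]
for n in (2, 3):
    print(n, padded_levels(levels, n), added_levels(levels, n))
```

For the counts 6, 4 and 3 the greatest common divisor is 1. Under
the assumed padding rule a base-unit of 2 adds one level and a
base-unit of 3 adds two, which is the kind of choice the user would
be asked to make.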

3.2. Block Design for the Exploratory Phase

The balanced incomplete block design (BIBD) is used in
controlled experiments to reduce their total number while
maintaining the symmetry of effects of two individual and
potentially interacting independent variables on one dependent
variable (see [5] for details). Unfortunately, it is not
possible to construct a BIBD for any number of levels even in the
two-dimensional case, and the constraints employed have no
obvious counterparts in higher dimensions. These reasons have
led us to its generalized concept, the loosely balanced
incomplete block design (LBIBD).

LBIBD ensures that a statistically reliable sample is
selected of all possible combinations of the levels of the
decision variables. The size of the sample is the fraction,
specified by the user as the reduction factor, of the cardinality
of the experiment. The design must satisfy two constraints:

(i) Each level of a variable appears an (approximately) equal
number of times;

(ii) Each level of a variable is combined with each level of
another variable an (approximately) equal number of times, for
all pairs of variables.
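
The two constraints can be checked mechanically for any candidate
set of experiments. The sketch below is only an illustration:
`balance_spreads` is a hypothetical helper that reports, for each
variable and each pair of variables, the gap between the most and
least frequent level (or level pair). A gap of 0 means perfect
balance; how large a gap still counts as "approximately equal" is
not stated in the text.

```python
from collections import Counter
from itertools import combinations, product


def balance_spreads(vectors):
    """Return (per-variable spread, per-pair spread) of occurrence counts."""
    k = len(vectors[0])
    levels = [sorted({v[i] for v in vectors}) for i in range(k)]

    # Constraint (i): how often each level of each variable appears.
    level_spread = []
    for i in range(k):
        counts = Counter(v[i] for v in vectors)
        level_spread.append(max(counts.values()) - min(counts.values()))

    # Constraint (ii): how often each pair of levels occurs together.
    # Pairs that never occur count as zero.
    pair_spread = {}
    for i, j in combinations(range(k), 2):
        counts = Counter((v[i], v[j]) for v in vectors)
        seen = [counts[(p, q)] for p, q in product(levels[i], levels[j])]
        pair_spread[(i, j)] = max(seen) - min(seen)
    return level_spread, pair_spread


full = list(product(range(2), range(3)))   # exhaustive 2 x 3 design
print(balance_spreads(full))
print(balance_spreads([(0, 0), (0, 1), (1, 0)]))
```

An exhaustive design is trivially balanced; the interest lies in
keeping the spreads small when only a fraction of the combinations
is retained.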

The term "loosely balanced" is due to the fact that another
rule concerning the symmetry between multiple co-occurrences of
levels, satisfiable only in certain instances of the
two-dimensional problem, has been relaxed and used only when
possible. The following concepts will be necessary in explaining
the other phases of QO-3.

Let the reduction factor be given as a fraction in lowest
terms, f = a/b; the size of the base-unit be n; the number of
decision variables of the numerical and ordered categorical types
be d; and the number of unordered categorical variables be u.
Let us also define a few terms. A 'chip' consists of (d-1)
indices (or level numbers), where an index value falls in the
range (1, n) inclusively. A 'basic block' consists of f*n**(d-1)
chips. An 'extended block' is the result of 'spawning' the
appropriate chips of a basic block. 'Spawning' means repeating
the chip along the dimension of a decision variable whose number
of levels is a multiple of the base-unit. A 'test vector' is
determined by (d+u) indices and represents the specification of
one experiment. Finally, the 'initial test basis' consists of a
set of test vectors computed by the LBIBD-generator for the
exploratory phase of QO-3.

Using number-theoretical arguments, it can be shown that if
x_i is an index of the i-th variable, then the indices of the
(d-1) variables of a basic block must satisfy

        (x_1 + x_2 + ... + x_{d-1}) mod b < a                 (1)

Inequality (1) defines stripes perpendicular to the
principal diagonal of the block. These stripes can be eliminated
and the experiments "randomized" (spread out) with

        [p(x_1) + p(x_2) + ... + p(x_{d-1})] mod b < a        (2)

where p is a permutation operator on the values 0, 1, ..., b-1.
We have chosen to use the multiplying factor (d-i) for x_i as the
respective permutation operator.
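
A minimal sketch of how inequalities (1) and (2) select the chips
of a basic block is given below. It assumes that chips are
enumerated over all index combinations in the range (1, n) and that
the operator p multiplies x_i by (d-i) before the sum is reduced
mod b; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def basic_block(n, d, a, b, permute=True):
    """Chips (tuples of d-1 indices in 1..n) kept by inequality (1) or (2)."""
    chips = []
    for chip in product(range(1, n + 1), repeat=d - 1):
        if permute:
            # Inequality (2): x_i is weighted by the factor (d - i).
            total = sum((d - i) * x for i, x in enumerate(chip, start=1))
        else:
            # Inequality (1): plain sum of the indices.
            total = sum(chip)
        if total % b < a:
            chips.append(chip)
    return chips


def level_counts(chips, position):
    """How often each index value occurs at the given chip position."""
    return Counter(chip[position] for chip in chips)


block = basic_block(n=5, d=3, a=2, b=5)
print(len(block), dict(level_counts(block, 0)), dict(level_counts(block, 1)))
```

With n = 5, d = 3 and f = 2/5 the block holds f*n**(d-1) = 10 of the
25 possible chips, and every index value occurs twice in each
position, in line with constraint (i) of the LBIBD.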

3.3. Sensitization as the Exploratory Phase

The process of sensitization is the exploratory phase of
QO-3. Its task is to find out where the initial test basis has
to be refined along the dimension of every decision variable.
(Note that, under ideal conditions, the final grid is such that
the difference in response values at adjacent points is
identically equal to ΔR over the whole domain of decision
variable space.) We describe the process and the underlying
heuristic through an example. Suppose one of the d variables,
v_j, was specified by the user to have five levels initially. Its
normalized values are (0 32 64 96 128). Also, assume it has an
MMR of 8. The system first considers the levels 0 and 32. The
extended block specifies sets of values of the other (d-1)
decision variables at which experiments are performed while the
value of v_j is held at 0 and 32, respectively. Accordingly, two
groups of response values are obtained, one for v_j = 0 and the
other for v_j = 32. The program forms the average of each group
of values. If the difference between the two averages is less
than ΔR, the subrange (0, 32) is "monotonically sensitized", i.e.
there is no need to refine it if the response surface is assumed
to be weakly monotonic. If this assumption is not made, the
midpoint v_j = 16 is selected. The corresponding average response
value is then compared with those for the two endpoints of the
subrange. If the respective differences are both less than ΔR,
the subrange is "completely sensitized". Otherwise, the midpoint
is added to the values of v_j as a level to be used in the
completing phase of experimentation. The subranges are halved
further whenever the results warrant it, as long as the length
of the subrange is no less than MMR, in our example 8. The same
procedure is followed for all subranges of v_j, and then for each
of the other decision variables.
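
The halving procedure for a single variable can be sketched as
follows. This is an illustration under stated assumptions, not the
QO-3 code: `avg_response` is a hypothetical stand-in for the average
response over the extended block at a given level, a subrange whose
endpoints already differ by ΔR or more is assumed to be split at
once, and the repetition of experiments for stochastic environments
is left out.

```python
def sensitize(levels, avg_response, delta_r, mmr, assume_monotonic=False):
    """Return the refined, sorted list of levels for one decision variable."""
    refined = set(levels)

    def refine(lo, hi):
        # Halving stops when the halves would be shorter than MMR.
        if (hi - lo) / 2 < mmr:
            return
        r_lo, r_hi = avg_response(lo), avg_response(hi)
        flat = abs(r_hi - r_lo) < delta_r
        if flat and assume_monotonic:
            return                      # "monotonically sensitized"
        mid = (lo + hi) / 2
        r_mid = avg_response(mid)
        if flat and abs(r_mid - r_lo) < delta_r and abs(r_mid - r_hi) < delta_r:
            return                      # "completely sensitized"
        refined.add(mid)                # keep the midpoint as a new level
        refine(lo, mid)
        refine(mid, hi)

    for lo, hi in zip(levels, levels[1:]):
        refine(lo, hi)
    return sorted(refined)


def bump(v):
    """Hypothetical non-monotonic response: a narrow peak at v = 16."""
    return 100 if v == 16 else 0


start = [0, 32, 64, 96, 128]
print(sensitize(start, bump, delta_r=10, mmr=8, assume_monotonic=True))
print(sensitize(start, bump, delta_r=10, mmr=8))
```

Under the monotonicity assumption the peak at 16 goes unnoticed
because the endpoint responses agree; without it the midpoint check
finds the peak and the subrange (0, 32) is refined down to the MMR.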

Finally, we note that the response values are naturally kept
after their averages are formed; they are needed also in the
completing phase of experimentation.

3.4. The Completing Phase of Experimentation

When all decision variables have been sensitized, the
experiments specified by all computed test vectors are performed.
(There is no saving possible for the unordered categorical
variables. The whole process has to be repeated for each value
of every such variable.)

Finally, QO-3 builds a DT of the results of the computations.
The paths go through the subranges of the variables, and the
response value is attached to the leaf level.

4. SOME RESULTS

As a final test run, we have defined a response surface as a
function of 2 numerical variables, 1 ordered and 1 unordered
categorical variables with the following conditions:

IF (state = solid)
   THEN IF t*d < 3000 THEN response = JtTd
                      ELSE response = (d/30)Jt*d.

IF (state = liquid)
   THEN IF t < 100 THEN response = t*d
                   ELSE response = (t/100)*t*d.

IF (state = gas)
   THEN IF (gas is radio-active) THEN response = 5*t*d
                                 ELSE response = (5/2)*t*d.

The user has specified the following values: f = 2/5, ΔR = 2000,

Variable Name    | Type        | Range        | MMR  | Number of
                 |             |              |      | Init. Levels
-----------------|-------------|--------------|------|-------------
temperature (=t) | numer.      | (0..200)     | 10   | 6
duration (=d)    | numer.      | (0..60)      | 2    | 4
state            | ord. cat.   | (solid..gas) | "64" | 3
radio-active     | unord. cat. | (yes, no)    | --   | 2

QO-3, written in MACLISP, took 31 seconds on a Honeywell Level
68/80 processor to design a total of 434 experiments (out of 3906
possible ones). The actual maximum difference in response values
at adjacent points was 1728.
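
One reading of the table reproduces the figure of 3906 possible
experiments. The sketch assumes that the MMR of the two numerical
variables is given in the variable's own units, that the "64" for
state refers to the normalized (0, 128) scale, and that the finest
admissible grid has range/MMR + 1 levels; these are interpretations
made for this example, not statements from the paper.

```python
from fractions import Fraction
from math import prod


def level_count(span, mmr):
    """Number of grid points when a range of length span is cut at MMR."""
    return int(span // mmr) + 1


max_levels = {
    "temperature": level_count(200, 10),   # range (0..200), MMR 10
    "duration": level_count(60, 2),        # range (0..60), MMR 2
    "state": level_count(128, 64),         # normalized range, MMR "64"
    "radio-active": 2,                     # unordered: yes, no
}

possible = prod(max_levels.values())
share = Fraction(434, possible)
print(max_levels, possible, share)
```

Under these assumptions the 434 designed experiments are exactly
one ninth of the finest possible grid.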